<!--begin.rcode results='hide', echo=FALSE, message=FALSE
library(caret)
data(BloodBrain)

session <- paste(format(Sys.time(), "%a %b %d %Y"),
                 "using caret version",
                 packageDescription("caret")$Version,
                 "and",
                 R.Version()$version.string)
    end.rcode-->

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!--
    Design by Free CSS Templates
    http://www.freecsstemplates.org
    Released for free under a Creative Commons Attribution 2.5 License

    Name       : Emerald 
    Description: A two-column, fixed-width design with dark color scheme.
    Version    : 1.0
    Released   : 20120902

  -->
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta name="keywords" content="" />
    <meta name="description" content="" />
    <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    <title>Parallel Processing</title>
    <link href='http://fonts.googleapis.com/css?family=Abel' rel='stylesheet' type='text/css' />
    <link href="style.css" rel="stylesheet" type="text/css" media="screen" />
  </head>
  <body>
    <div id="wrapper">
      <div id="header-wrapper" class="container">
	<div id="header" class="container">
	  <div id="logo">
	    <h1><a href="#">Parallel Processing</a></h1>
	  </div>
          <!--
	      <div id="menu">
		<ul>
		  <li class="current_page_item"><a href="#">Homepage</a></li>
		  <li><a href="#">Blog</a></li>
		  <li><a href="#">Photos</a></li>
		  <li><a href="#">About</a></li>
		  <li><a href="#">Contact</a></li>
		</ul>
	      </div>
              -->
	</div>
	<div><img src="images/img03.png" width="1000" height="40" alt="" /></div>
      </div>
      <!-- end #header -->
      <div id="page">
	<div id="content">
	

<p>
In this package, resampling is the primary approach for optimizing predictive models with tuning parameters. To do this, many alternate versions of the training set are used to train the model and predict a hold-out set. This process is repeated many times to get performance estimates that generalize to new data sets. Each of the resampled data sets is independent of the others, so there is no formal requirement that the models be run sequentially. If a computer with multiple processors or cores is available, the computations can be spread across these "workers" to reduce the total computation time. <a href="http://cran.r-project.org/web/packages/caret/index.html"><strong>caret</strong></a> leverages one of the parallel processing frameworks in R to do just this. The <a href="http://cran.r-project.org/web/packages/foreach/index.html"><strong>foreach</strong></a> package allows R code to be run either sequentially or in parallel using several different technologies, such as the <a href="http://cran.r-project.org/web/packages/multicore/index.html"><strong>multicore</strong></a> or <a href="http://cran.r-project.org/web/packages/Rmpi/index.html"><strong>Rmpi</strong></a> packages (see <a href="http://www.jstatsoft.org/v31/i01/paper">Schmidberger <i>et al.</i>, 2009</a> for summaries and descriptions of the available options). Several R packages work with <a href="http://cran.r-project.org/web/packages/foreach/index.html"><strong>foreach</strong></a> to implement these techniques, such as <a href="http://cran.r-project.org/web/packages/doMC/index.html"><strong>doMC</strong></a> (for <a href="http://cran.r-project.org/web/packages/multicore/index.html"><strong>multicore</strong></a>) or <a href="http://cran.r-project.org/web/packages/doMPI/index.html"><strong>doMPI</strong></a> (for <a href="http://cran.r-project.org/web/packages/Rmpi/index.html"><strong>Rmpi</strong></a>).
</p>
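<p>
As a minimal sketch of the <strong>foreach</strong> idea described above (the loop body and the object names <code>seqRes</code> and <code>parRes</code> are illustrative and are not taken from <strong>caret</strong>), the same loop can be written once and run sequentially with <code>%do%</code> or handed to whatever backend is registered with <code>%dopar%</code>. If no backend has been registered, <code>%dopar%</code> falls back to sequential execution and issues a warning.
</p>
<!--begin.rcode eval=FALSE
library(foreach)

## %do% always evaluates the iterations one after another
seqRes <- foreach(i = 1:3, .combine = c) %do% sqrt(i)

## %dopar% sends the iterations to the registered parallel backend;
## with no backend registered it runs sequentially (with a warning)
parRes <- foreach(i = 1:3, .combine = c) %dopar% sqrt(i)

## Number of workers that %dopar% would currently use
getDoParWorkers()
    end.rcode-->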
<p>
To tune a predictive model using multiple workers, the syntax of the <a href="http://cran.r-project.org/web/packages/caret/index.html"><strong>caret</strong></a> package functions (e.g. <span class="mx funCall">train</span>, <span class="mx funCall">rfe</span> or <span class="mx funCall">sbf</span>) does not change. A separate function is used to "register" the parallel processing technique and specify the number of workers to use. For example, to use the <a href="http://cran.r-project.org/web/packages/multicore/index.html"><strong>multicore</strong></a> package (not available on Windows) with five cores on the same machine, the package is loaded and then registered:
</p>
<!--begin.rcode eval=FALSE
library(doMC)
registerDoMC(cores = 5)
## All subsequent models are then run in parallel
model <- train(y ~ ., data = training, method = "rf")
    end.rcode-->
<p>
The syntax for other packages associated with <a href="http://cran.r-project.org/web/packages/foreach/index.html"><strong>foreach</strong></a> is very similar. Note that as the number of workers increases, the memory required also increases. For example, using five workers would keep a total of six versions of the data in memory. If the data are large or the computational model is demanding, performance can suffer if the amount of required memory exceeds the physical amount available. Also, <span class="mx funCall">rfe</span> and <span class="mx funCall">sbf</span> may call <span class="mx funCall">train</span> for some models. In this case, registering <i>M</i> workers will actually invoke <i>M</i><sup>2</sup> total processes.
</p>
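<p>
To illustrate how similar the registration step is for another backend, the sketch below uses the <strong>doParallel</strong> package, which is not mentioned above and is chosen here only as an assumption about what is installed. The worker count of five mirrors the earlier example, and the <code>bbbDescr</code> and <code>logBBB</code> objects come from the <code>BloodBrain</code> data shipped with <strong>caret</strong>.
</p>
<!--begin.rcode eval=FALSE
library(caret)
library(doParallel)
data(BloodBrain)

## Start five worker processes and register them with foreach
cl <- makePSOCKcluster(5)
registerDoParallel(cl)

## The call to train is unchanged; resampling now uses the workers
model <- train(x = bbbDescr, y = logBBB, method = "rf")

## Shut the workers down and return to sequential processing
stopCluster(cl)
registerDoSEQ()
    end.rcode-->
<p>
Stopping the cluster when the work is done releases the memory held by each worker's copy of the data.
</p>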
<p>
Does this help reduce the time to fit models? A moderately sized data set (4331 rows and 8 columns) was modeled multiple times
with different numbers of workers for several models. Random forest was
used with 2000 trees and tuned over 10 values of <i>m</i><sub>try</sub>. Variable
importance calculations were also conducted during each model
fit. Linear discriminant analysis was also run, as was a
cost-sensitive radial basis function support vector machine (tuned
over 15 cost values). All models were tuned using five repeats of
10-fold cross-validation. The results are shown in the figure below. The y-axis corresponds to the total execution
time (encompassing model tuning and the final model fit) and the x-axis to the
number of workers. Random forest clearly took the longest to train and
the LDA models were very computationally efficient. The total time (in
minutes) decreased as the number of workers increased but stabilized
around seven workers. The data for this plot were generated in a
randomized fashion so that there should be no bias in the run
order. The bottom right panel shows the <i>speed-up</i>, which is the
sequential time divided by the parallel time. For example, a speed-up
of three indicates that the parallel version was three times faster
than the sequential version. At best, parallelization can achieve
linear speed-ups; that is, for <i>M</i> workers, the parallel time is
1/<i>M</i> of the sequential time. For these models, the speed-up is close to linear until four
or five workers are used. After this, there is only a small improvement in
performance. Since LDA is already computationally efficient, the
speed-up levels off more rapidly than for the other models. While not
linear, the decrease in execution time is helpful: a nearly 10-hour
model fit was decreased to about 90 minutes.
</p>
<p><img width="600" height="450" src="parallel.png" alt="Execution time and speed-up versus the number of workers" /></p>
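<p>
As a small worked example of the speed-up calculation, the sketch below plugs in the approximate times quoted in the text (a nearly 10-hour sequential fit reduced to about 90 minutes). The variable names are illustrative, and because both times are rounded the result is only a rough figure.
</p>
<!--begin.rcode eval=FALSE
## Approximate times from the text, in minutes
seqMinutes <- 10 * 60   # "nearly 10 hours" run sequentially
parMinutes <- 90        # "about 90 minutes" run in parallel

## Speed-up is the sequential time divided by the parallel time
speedUp <- seqMinutes / parMinutes
round(speedUp, 1)       # roughly 6.7
    end.rcode-->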
<p>
Note that some models, especially those using the <a href="http://cran.r-project.org/web/packages/RWeka/index.html"><strong>RWeka</strong></a> package, may not be able to be run in parallel due to the underlying code structure. 
</p>
<p>
<span class="mx funCall">train</span>, <span class="mx funCall">rfe</span>, <span class="mx funCall">sbf</span>, <span class="mx funCall">bag</span> and
<span class="mx funCall">avNNet</span> have an additional argument, typically set in their respective
control functions, called <span class="mx arg">allowParallel</span> that defaults to
<code>TRUE</code>. When <code>TRUE</code>, the code will be executed in parallel
if a parallel backend (e.g. <strong>doMC</strong>) is registered. When
<span class="mx arg">allowParallel</span><code> = FALSE</code>, the parallel backend is always
ignored. The main use case is when <span class="mx funCall">rfe</span> or <span class="mx funCall">sbf</span> calls
<span class="mx funCall">train</span>. If a parallel backend with <i>P</i> processors is being used,
the combination of these functions will create <i>P</i><sup>2</sup> processes. Since
some operations benefit more from parallelization than others, this argument lets the
user concentrate computing resources on specific
functions.
</p>
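<p>
One way this could look in practice is sketched below: the outer resampling loop of <span class="mx funCall">rfe</span> is allowed to use the parallel backend while the inner calls to <span class="mx funCall">train</span> are kept sequential, so that <i>P</i> rather than <i>P</i><sup>2</sup> processes are created. The model, subset sizes and resampling settings are assumptions chosen for illustration only, and the data are the <code>BloodBrain</code> set shipped with <strong>caret</strong>.
</p>
<!--begin.rcode eval=FALSE
library(caret)
data(BloodBrain)

## Outer loop (rfe): use the registered parallel backend
rfeCtrl <- rfeControl(functions = caretFuncs,
                      method = "cv",
                      number = 10,
                      allowParallel = TRUE)

## Inner loop (train): ignore the backend and run sequentially
trCtrl <- trainControl(method = "cv",
                       number = 5,
                       allowParallel = FALSE)

## Extra arguments (method, trControl) are passed on to train
rfeFit <- rfe(x = bbbDescr, y = logBBB,
              sizes = c(5, 10, 20),
              rfeControl = rfeCtrl,
              method = "svmRadial",
              trControl = trCtrl)
    end.rcode-->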
<p>
One additional "trick" that <span class="mx funCall">train</span> exploits to increase computational efficiency is to use sub-models: a single model fit can produce predictions for multiple tuning parameter values. For example, in most implementations of boosted models, a model trained on <i>B</i> boosting iterations can produce predictions for models with fewer than <i>B</i> iterations. Suppose a <span class="mx funCall">gbm</span> model was tuned over the following grid:
</p>


<!--begin.rcode tidy=FALSE
gbmGrid <-  expand.grid(interaction.depth = c(1, 5, 9),
                        n.trees = (1:15)*100,
                        shrinkage = 0.1,
                        n.minobsinnode = 20)
    end.rcode-->
    
<!--begin.rcode results='hide', echo=FALSE, message=FALSE
mods <- getModelInfo()
isSeq <- unlist(lapply(mods, function(x) !is.null(x$loop)))
isSeq <- names(isSeq)[isSeq]

mods <- paste(sort(paste('<code>', isSeq, '</code>', sep = "")), collapse = ', ')
    end.rcode-->

<p>
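This grid contains 45 tuning parameter combinations, but <span class="mx funCall">train</span> only creates model objects for three of them (one per value of <span class="mx arg">interaction.depth</span>, each fit with the largest number of trees) and derives the other predictions from these objects. This trick is used for the following models: <!--rinline I(mods) -->.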
</p>
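<p>
The counting behind the sub-model trick can be checked with base R. The sketch below rebuilds the grid shown above and keeps, for each combination of the other parameters, only the largest value of <code>n.trees</code>. The object name <code>fitsNeeded</code> is illustrative, and this is a rough picture of the idea rather than the code <span class="mx funCall">train</span> uses internally.
</p>
<!--begin.rcode eval=FALSE
gbmGrid <- expand.grid(interaction.depth = c(1, 5, 9),
                       n.trees = (1:15) * 100,
                       shrinkage = 0.1,
                       n.minobsinnode = 20)

## Total number of tuning parameter combinations
nrow(gbmGrid)      # 45

## One fit per combination of the non-sequential parameters,
## using the largest number of trees in each group
fitsNeeded <- aggregate(n.trees ~ interaction.depth + shrinkage + n.minobsinnode,
                        data = gbmGrid, FUN = max)
nrow(fitsNeeded)   # 3
    end.rcode-->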

	  <div style="clear: both;">&nbsp;</div>
	</div>
	<!-- end #content -->
<div id="sidebar">
<ul>
  <li>
    <h2>General Topics</h2>
    <ul>
      <li><a href="index.html">Front Page</a></li>
      <li><a href="visualizations.html">Visualizations</a></li>
      <li><a href="preprocess.html">Pre-Processing</a></li>
      <li><a href="splitting.html">Data Splitting</a></li>
      <li><a href="varimp.html">Variable Importance</a></li>
      <li><a href="other.html">Model Performance</a></li>
      <li><a href="parallel.html">Parallel Processing</a></li>
    </ul>
    <h2>Model Training and Tuning</h2>
    <ul>
      <li><a href="training.html">Basic Syntax</a></li>
      <li><a href="modelList.html">Sortable Model List</a></li>
      <li><a href="bytag.html">Models By Tag</a></li>
      <li><a href="similarity.html">Models By Similarity</a></li>
      <li><a href="custom_models.html">Using Custom Models</a></li>
      <li><a href="sampling.html">Sampling for Class Imbalances</a></li> 
      <li><a href="random.html">Random Search</a></li> 
      <li><a href="adaptive.html">Adaptive Resampling</a></li> 
    </ul>
    <h2>Feature Selection</h2>
    <ul>
      <li><a href="featureselection.html">Overview</a></li>
      <li><a href="rfe.html">RFE</a></li>
      <li><a href="filters.html">Filters</a></li>
      <li><a href="GA.html">GA</a></li>
      <li><a href="SA.html">SA</a></li>
    </ul>  
  </li>
</ul>
</div>
<!-- end #sidebar -->
	<div style="clear: both;">&nbsp;</div>
      </div>
      <div class="container"><img src="images/img03.png" width="1000" height="40" alt="" /></div>
      <!-- end #page -->
    </div>
    <div id="footer-content"></div>
    <div id="footer">
      <p>Created on <!--rinline I(session) -->.</p>
    </div>
    <!-- end #footer -->
  </body>
</html>
